Systems and Methods for Subject Identification (ID) Modeling

ABSTRACT

Systems and methods for subject identification (ID) modeling are disclosed. A subject identification may be associated with information contained in one or more core domains such as a patient domain, a country domain, and/or an investigator domain. The domains can be generically designed such that data sources that are unknown at the time the domains are created can be managed. In this way, using generic structures that support the domains, data sources can be added and/or updated as additional information and/or data sources become available. Using various graphical user interfaces, a user can dynamically associate patient criteria, country criteria, investigator criteria, and/or other information with subject identifications. A subject identification may be associated with a specified capture date. Information contained in the various domains may be filtered such that only information contained in the domain on or before the capture date is available for the subject identification.

CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority to U.S. Provisional Application No. 61/663,292, filed on Jun. 22, 2012, entitled “Method and System to Manipulate Multiple Selections against a Population of Elements;” U.S. Provisional Application No. 61/663,057, filed on Jun. 22, 2012, entitled “Systems and Methods For Predictive Analytics For Site Initiation and Patient Enrollment;” U.S. Provisional Application No. 61/663,299, filed on Jun. 22, 2012, entitled “Methods and Systems for Predictive Clinical Planning and Design and Integrated Execution Services;” U.S. Provisional Application No. 61/663,398, filed on Jun. 22, 2012, entitled “Systems and Methods for Subject Identification (ID) Modeling;” U.S. Provisional Application No. 61/663,219, filed on Jun. 22, 2012, entitled “Systems and Methods for Analytics on Viable Patient Populations;” U.S. Provisional Application No. 61/663,357, filed on Jun. 22, 2012, entitled “Methods and Systems for a Clinical Trial Development Platform;” and U.S. Provisional Application No. 61/663,216, filed on Jun. 22, 2012, entitled “Systems and Methods for Data Visualization.” The entirety of each of these applications is hereby incorporated by reference herein.

FIELD OF THE INVENTION

The present invention relates generally to systems and methods for the creation and analysis of data associated with clinical trials. The present invention relates more specifically to systems and methods for subject identification (ID) modeling.

BACKGROUND

Clinical trials for molecules that may become pharmaceutical products often last for years. The core cost of a trial is affected primarily by the length of the trial, and a delay of even a single day can cost hundreds of thousands or even millions of dollars.

Data associated with clinical trials is often associated with various data sources and may be highly diverse. Highly diverse data from numerous data sources is often difficult to organize, assemble, and analyze. Therefore, systems and methods for the collation of highly diverse data into usable data would be advantageous. Furthermore, systems and methods for dynamic analysis to support better understanding of the impacts of decisions against clinical trials would be advantageous.

SUMMARY

Embodiments of the present invention provide systems and methods for subject identification (ID) modeling. In one embodiment, raw data is processed to an application using a tool that enables a user to build subject identifications dynamically. In some embodiments, such subject IDs can be created by a user without technical expertise.

In one embodiment, a subject identification can be created dynamically. For example, a user can interact with a user interface to create a subject by entering or selecting a subject name and a molecule team. In one embodiment, a unique subject identification (ID) is automatically created or assigned once a subject name and a molecule team have been entered or selected.

A subject identification may be associated with information contained in other tables and/or databases. For example, in one embodiment, subject identifications may be associated with information contained in one or more core domains such as a patient domain, a country domain, and/or an investigator domain. In some embodiments, one or more domains are generically designed such that data sources that are unknown at the time the domain(s) are created can be managed. In this way, using a generic structure that supports a domain, data sources can be added and/or updated as additional information and/or data sources become available. Exemplary models that depict generic structures which support such domains are disclosed herein and variations are within the scope of this disclosure.

Using various graphical user interfaces, a user can dynamically associate patient criteria, country criteria, investigator criteria, and/or other information with subject identifications. In some embodiments, as the user interacts with the graphical user interfaces, associations between subject identifications and data in other tables and/or databases are dynamically updated in real time or substantially real time. For example, when a user selects an indicator to be associated with a particular subject identification, the association may be created. As another example, when a user selects various indicators to be associated with a subject identification and then clicks an update button on the graphical user interface, the selected indicators may be dynamically associated with the subject identification.

Information associated with a particular subject identification may be frozen at a particular date and/or time. For example, a subject identification may be associated with a specified capture date. In this embodiment, information contained in the various domains may be filtered such that only information contained in the domain on or before the capture date is available for the subject identification. In this way, information for a data model may be updated as additional information for the data model becomes available but the information available to a particular subject identification can be limited to a static point in time.

These embodiments are mentioned not to limit or define the invention, but to provide an example of an embodiment of the invention to aid understanding thereof. Embodiments are discussed in the Detailed Description, and further description of the invention is provided there. Advantages offered by the various embodiments of the present invention may be further understood by examining this specification.

BRIEF DESCRIPTION OF THE FIGURES

These and other features, aspects, and advantages of the present invention are better understood when the following Detailed Description is read with reference to the accompanying drawings, wherein:

FIG. 1 is a block diagram illustrating an exemplary environment for implementation of one embodiment of the present invention;

FIG. 2 is a flowchart illustrating a method for dynamically creating and/or updating information associated with subject identifications according to one embodiment of the present invention;

FIG. 3 is a partial entity-relationship diagram illustrating how a subject ID is linked to complex and/or varied data sets from multiple sources according to one embodiment of the present invention;

FIG. 4 is a screen shot of a subject table editor according to one embodiment of the present invention;

FIG. 5 is a screen shot of patient selection according to one embodiment of the present invention;

FIG. 6 is a screen shot of a data source table editor according to one embodiment of the present invention;

FIG. 7 is a screen shot of investigators for investigator performance loads according to embodiments of the present invention;

FIG. 8 is a screen shot of investigators for investigator performance loads according to embodiments of the present invention;

FIG. 9 is a screen shot of investigators for investigator performance loads according to embodiments of the present invention;

FIG. 10 is a screen shot of a patient prevalence editor according to one embodiment of the present invention;

FIG. 11 is an exemplary investigator data model according to one embodiment of the present invention;

FIG. 12 is an exemplary country data model according to one embodiment of the present invention; and

FIG. 13 is an exemplary patient data model according to one embodiment of the present invention.

DETAILED DESCRIPTION

Embodiments of the present invention provide systems and methods for subject identification (ID) modeling.

Illustrative Embodiment of the Present Invention

One illustrative embodiment of the present invention comprises an application for creating and/or updating subject identification (ID) models for clinical trials. The embodiment allows a user to access an application that presents a variety of clinical trial-related parameters for various patients, countries, and/or investigators. These parameters may include, for example, the population of a country, the regulatory environment, and/or the level of risk associated with conducting a trial in a particular country.

Using various graphical user interfaces associated with the application, a user can create subject identifications and/or select parameters related to patients, countries, and/or investigators for subject identifications. For example, in one embodiment, a user can select patients associated with one or more ICD9 codes for a particular subject identification. As another example, a user can add or remove various indicators—such as Ace Inhibitors, Acne, etc.—and/or phases and/or trials.

As the user interacts with the graphical user interface, associations with data contained in various databases are added, removed, or updated. In some embodiments, such associations may be added, removed, or updated dynamically without requiring technical expertise regarding the underlying data structures. For example, a user can select one or more ICD9 codes to associate with a particular subject identification. In one embodiment, when a user selects an ICD9 code, associations between the selected ICD9 code, patients corresponding to the ICD9 code, and/or the subject identification are created. Similarly, when a user deselects an ICD9 code, associations between subject identifications and information in other tables and/or databases may be updated or removed.
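By way of illustration only, the select/deselect behavior described above might be sketched in Python as follows. The class `SubjectAssociations`, its method names, and the in-memory dictionary are hypothetical stand-ins for whatever association tables an embodiment actually uses; the ICD9 codes are sample values.

```python
class SubjectAssociations:
    """Hypothetical in-memory stand-in for a subject-to-ICD9 association table."""

    def __init__(self):
        # Maps a subject ID to the set of ICD9 codes currently associated with it.
        self._codes = {}

    def select(self, subject_id, icd9_code):
        """Create the association when the user selects a code."""
        self._codes.setdefault(subject_id, set()).add(icd9_code)

    def deselect(self, subject_id, icd9_code):
        """Remove the association when the user deselects a code (no-op if absent)."""
        self._codes.get(subject_id, set()).discard(icd9_code)

    def codes_for(self, subject_id):
        """Return the codes associated with a subject, in sorted order."""
        return sorted(self._codes.get(subject_id, set()))
```

In such a sketch, each user interaction maps to exactly one call, so the stored associations always reflect the current state of the interface.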

In the illustrative embodiment, for various investigators, the user is able to specify investigator-specific parameters, such as indicators, phases, trials and/or other relevant parameters. The process is iterative; the user is able to change the parameters for patients, countries and/or investigators to determine the most appropriate sites to utilize for a clinical trial. As the user changes these parameters, information and/or associations corresponding to subject identifications and/or data structures may be dynamically added, removed, or updated. The results of the user's selections can then be used as part of a larger clinical trial analysis application.

This illustrative embodiment neither limits nor defines the invention. Rather, the illustrative embodiment is meant to provide an example of how the present invention may be implemented.

Illustrative Environment

Referring now to the drawings, in which like numerals indicate like elements throughout the several figures, FIG. 1 is a block diagram illustrating an exemplary environment for implementation of one embodiment of the present invention. The embodiment shown in FIG. 1 includes a client 100 that allows a user to interface with an application server 200, web server 300, and/or database 400 via a network 500.

The client 100 may be, for example, a personal computer (PC), such as a laptop or desktop computer, which includes a processor and a computer-readable medium. The client 100 also includes user input devices, such as a keyboard and mouse or touch screen, and one or more output devices, such as a display. In some embodiments of the invention, the user of client 100 accesses an application or applications specific to one embodiment of the invention. In other embodiments, the user accesses a standard application, such as a web browser on client 100, to access applications running on a server such as application server 200, web server 300, or database 400. For example, in one embodiment, applications including a design studio application for planning and designing clinical trials are stored in the memory of client 100. The client 100 may also be referred to as a terminal in some embodiments of the present invention.

Such applications may be resident in any suitable computer-readable medium and executable on any suitable processor. Such processors may comprise, for example, a microprocessor, an ASIC, a state machine, or other processor, and can be any of a number of computer processors, such as processors from Intel Corporation, Advanced Micro Devices Incorporated, and Motorola Corporation. The computer-readable medium stores instructions that, when executed by the processor, cause the processor to perform the steps described herein.

The client 100 provides a software layer, which is the interface through which the user interacts with the system by receiving and displaying data to and from the user. In one embodiment, the software layer is implemented in the programming language C# (also referred to as C Sharp). In other embodiments, the software layer can be implemented in other languages such as Java or C++. The software layer may be graphical in nature, using visual representations of data to communicate said data to one or more users. The visual representations of data may also be used to receive additional data from one or more users. In one embodiment, the visual representation appears as a spider-like layout of nodes and connectors extending from each node to a central node.

Embodiments of computer-readable media comprise, but are not limited to, an electronic, optical, magnetic, or other storage device, transmission device, or other device that comprises some type of storage and that is capable of providing a processor with computer-readable instructions. Other examples of suitable media comprise, but are not limited to, a floppy disk, CD-ROM, DVD, magnetic disk, memory chip, ROM, RAM, PROM, EPROM, EEPROM, an ASIC, a configured processor, all optical media, all magnetic tape or other magnetic media, or any other medium from which a computer processor can read instructions. Also, various other forms of computer-readable media may be embedded in devices that may transmit or carry instructions to a computer, including a router, private or public network, or other transmission device or channel, both wired and wireless. The instructions may comprise code from any suitable computer-programming language, including, for example, C, C#, Visual Basic, Java, Python, Perl, and JavaScript.

The application server 200 also comprises a processor and a memory. The application server may execute business logic or other shared processes. The application server may be, for example, a Microsoft Windows Server operating in a .NET framework, an IBM Weblogic server, or a Java Enterprise Edition (J2EE) server. While the application server 200 is shown as a single server, the application server 200 and the other servers 300, 400 shown may be combined or may include multiple servers operating together to perform various processes. In such embodiments, techniques such as clustering or high availability clustering may be used. Benefits of architectures such as these include redundancy and performance, among others.

In the embodiment shown in FIG. 1, the application server 200 is in communication with a web server 300 via a network connection 250. The web server 300 also comprises a processor and a memory. In the memory are stored applications including web server software. Examples of web server software include Microsoft Internet Information Services (IIS), Apache Web Server, and Sun Java System Web Server from Oracle, among others.

In the embodiment shown in FIG. 1, the web server 300 is in communication with a database 400 via a network connection 350 and a network connection 450. The web server 300 provides a web service layer that, together or separate from application server 200, acts as middleware between a database 400 and the software layer, represented by the client 100. The web server 300 communicates with the database 400 to send and retrieve data to and from the database 400.

The network 500 may be any of a number of public or private networks, including, for example, the Internet, a local area network (“LAN”), or a wide area network (“WAN”). The network connections 150, 250, 350, and 450 may be wired or wireless networks and may use any known protocol or standard, including TCP/IP, UDP, multicast, 802.11b, 802.11g, 802.11n, or any other known protocol or standard. Further, the network 500 may represent a single network or different networks. As would be clear to one of skill in the art, the client 100, servers 200, 300, and database 400 may be in communication with each other over the network or directly with one another.

The database 400 may be one or a plurality of databases that store electronically encoded information comprising the data required to plan, design, and execute a clinical trial. In one embodiment, the data comprises one or more design elements corresponding to the various elements related to one or more clinical trials. The database 400 may be implemented as any known database, including an SQL database or an object database. Further, the database software may be any known database software, such as Microsoft SQL Server, Oracle Database, MySQL, Sybase, or others.

Illustrative Process for Adding/Updating Information for Subject Identifications

FIG. 2 is a flowchart illustrating a method for dynamically creating and/or updating information associated with subject identifications according to one embodiment of the present invention.

The method 200 shown in FIG. 2 begins when a subject is created or selected 210. For example, an application may comprise a user interface for creating or selecting a subject. In one embodiment, a subject can be created by entering or selecting a subject name and/or a molecule team. In such an embodiment, a unique subject identification corresponding to the entered or selected subject name and/or molecule team may be dynamically created. For example, in one embodiment, when a cursor moves off a particular row, data associated with the row may be added or updated in a database and/or table containing information for subjects. In one embodiment, a message is shown in the application that indicates whether the database and/or table were successfully modified.
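As a minimal sketch of block 210, and not a definitive implementation, the following Python example uses the standard `sqlite3` module to show one way a unique subject identification could be assigned automatically when a subject name and molecule team are entered. The table name `subject`, its column names, and the helper functions are assumptions made for illustration only.

```python
import sqlite3


def open_subject_store():
    """Create an in-memory database with a hypothetical subject table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE subject ("
        " subject_id INTEGER PRIMARY KEY AUTOINCREMENT,"  # assigned by the database
        " subject_name TEXT NOT NULL,"
        " molecule_team TEXT NOT NULL,"
        " capture_date TEXT)"  # optional; may be set later
    )
    return conn


def create_subject(conn, subject_name, molecule_team):
    """Insert a subject row and return the ID the database assigned to it."""
    cur = conn.execute(
        "INSERT INTO subject (subject_name, molecule_team) VALUES (?, ?)",
        (subject_name, molecule_team),
    )
    conn.commit()
    return cur.lastrowid
```

Here the user supplies only the name and team; the identifier is generated by the database, which mirrors the automatic assignment described above.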

After creating or selecting a subject 210, the method 200 proceeds to block 220. In block 220, patient parameters are received. For example, codes corresponding to particular diseases and/or illnesses, such as ICD9 or ICD10 codes, may be received through a graphical user interface of the application. Based on the selected patient parameters, a number of patients in one or more databases that meet the selected criteria may be displayed in the application. In some embodiments, information regarding patients and/or patient parameters may be stored in multiple databases and/or tables. In such an embodiment, the application may dynamically create or update associations between the various databases and/or tables containing information corresponding to patients and selected subject identification(s). For example, patient information may be stored in one or more patient databases corresponding to one or more generic patient data models. In such an embodiment, associations between the patient database(s) and subject may be dynamically created, modified, or removed based on selected or deselected patient parameters.
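For illustration, the patient count displayed in block 220 might be computed along the following lines. This is a sketch under stated assumptions: the function name, the `(patient_id, icd9_code)` row layout, and the sample codes are hypothetical and do not reflect any particular patient data model.

```python
def count_matching_patients(condition_rows, selected_codes):
    """Count distinct patients having at least one of the selected ICD9 codes.

    condition_rows: iterable of (patient_id, icd9_code) pairs.
    selected_codes: iterable of ICD9 codes chosen in the user interface.
    """
    selected = set(selected_codes)
    # A set comprehension ensures a patient with several matching codes counts once.
    matching = {patient_id for patient_id, code in condition_rows if code in selected}
    return len(matching)
```

Re-running such a function whenever a code is selected or deselected would keep the displayed count consistent with the current selection.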

After patient parameters are received 220, the method 200 proceeds to block 230. In block 230, investigator parameters are received. For example, various indicators, phases, and/or trials may be selected or removed for one or more subject identifications. In some embodiments, information regarding investigators and/or investigator parameters may be stored in multiple databases and/or tables. In such an embodiment, the application may dynamically create or update associations between the various databases and/or tables containing information corresponding to investigators and selected subject identification(s). For example, investigator information may be stored in one or more investigator databases corresponding to one or more generic investigator data models. In such an embodiment, associations between the investigator database(s) and subjects may be dynamically created, modified, or removed based on selected or deselected investigator parameters.

After investigator parameters are received 230, the method 200 proceeds to block 240. In block 240, country parameters are received. In some embodiments, information regarding countries and/or country parameters may be stored in multiple databases and/or tables. In such an embodiment, the application may dynamically create or update associations between the various databases and/or tables containing information corresponding to countries and selected subject identification(s). For example, country information may be stored in one or more country databases corresponding to one or more generic country data models. In such an embodiment, associations between the country database(s) and subjects may be dynamically created, modified, or removed based on selected or deselected country parameters.

After country parameters are received 240, the method 200 proceeds to block 250. In block 250, a capture date is received. Based on the capture date for a particular subject, the application may update the subject identification such that only information associated with the databases on or before the capture date can be included in any data analysis associated with the subject. For example, one or more databases associated with a subject may periodically be updated as new information becomes available. In this embodiment, even if the database is updated, the information available for a selected subject may be limited to information associated with the database on or before the capture date associated with the subject. In this way, a static snapshot of data for a particular subject can be maintained while allowing databases to continue to receive additional data as it becomes available. In some embodiments, the additional information may be used by other subjects that either do not have a specified capture date or that have a capture date that is after the information is received. Numerous other embodiments are disclosed herein and variations are within the scope of this disclosure.
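The capture-date filtering of block 250 can be sketched, by way of example only, as a simple date comparison. The function name `snapshot` and the `loaded_on` field are hypothetical; an embodiment may record the date on which information entered a database in any suitable manner.

```python
from datetime import date


def snapshot(records, capture_date=None):
    """Return the records visible to a subject with the given capture date.

    records: list of dicts, each with a 'loaded_on' datetime.date value
             (a hypothetical field naming when the record entered the database).
    capture_date: datetime.date, or None when the subject has no capture date.
    """
    if capture_date is None:
        # No capture date: the subject sees everything currently available.
        return list(records)
    # Keep only information present on or before the capture date.
    return [r for r in records if r["loaded_on"] <= capture_date]
```

Because the underlying list is never modified, later-loaded records remain available to subjects without a capture date or with a later one.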

Illustrative Associations for Subject ID and Data Sets

FIG. 3 is a partial entity-relationship diagram illustrating how a subject ID is linked to complex and/or varied data sets from multiple sources according to one embodiment of the present invention. In the embodiment shown in FIG. 3, subject identifications (IDs), subject names, molecule teams, and/or capture dates are stored in a subject table. In this embodiment, a unique subject ID may be selected or assigned for each subject. In one embodiment, one or more of the subject name, molecule team, and/or capture date may be optional. In other embodiments, one or more of the subject name, molecule team, and/or capture date may be required. In still other embodiments, additional information for a subject ID may be optional or required. Referring back to FIG. 3, numerous tables contain data that corresponds to subject IDs in the subject table. For example, in FIG. 3, a subject ID in the Subject (Dim) table may correspond with a subject ID in the Staffing (Fact) table. In this embodiment, the subject ID in the Staffing (Fact) table corresponds with a country identification. In other embodiments, information such as staffing, cycle times, monthly recruitment, trial saturation, patient prevalence, patient information, patient morbidity, patient concomitant information, investigator performance information, and/or investigator trial information may be associated with a subject identification (ID) in a subject table. Numerous other embodiments are disclosed herein and variations are within the scope of this disclosure.
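As a non-limiting sketch of the dimension-to-fact relationship described above, the following Python example joins a subject dimension table to a staffing fact table on the subject ID. The table names `subject_dim` and `staffing_fact`, the `staff_count` column, and the sample values are assumptions for illustration and are not taken from FIG. 3.

```python
import sqlite3


def build_staffing_demo():
    """Create hypothetical Subject (Dim) and Staffing (Fact) tables with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE subject_dim (subject_id INTEGER PRIMARY KEY, subject_name TEXT);
        CREATE TABLE staffing_fact (subject_id INTEGER, country_id INTEGER, staff_count INTEGER);
        INSERT INTO subject_dim VALUES (1, 'Subject A');
        INSERT INTO subject_dim VALUES (2, 'Subject B');
        INSERT INTO staffing_fact VALUES (1, 840, 12);
        INSERT INTO staffing_fact VALUES (1, 124, 5);
        INSERT INTO staffing_fact VALUES (2, 840, 7);
    """)
    return conn


def staffing_by_country(conn, subject_id):
    """Return (country_id, staff_count) rows for one subject, ordered by country."""
    return conn.execute(
        "SELECT f.country_id, f.staff_count"
        " FROM staffing_fact AS f"
        " JOIN subject_dim AS s ON s.subject_id = f.subject_id"
        " WHERE s.subject_id = ?"
        " ORDER BY f.country_id",
        (subject_id,),
    ).fetchall()
```

The same join pattern would apply to any other fact table that carries a subject ID.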

Data Steward Application and Screen Shots

Data Steward is a software tool that can be used to directly modify databases according to embodiments of the present invention. Appendix A, which is hereby incorporated by reference in its entirety, comprises a User Guide for a Data Steward according to embodiments of the present invention.

FIG. 4 is a screen shot of a subject table editor according to one embodiment of the present invention. In the subject table editor interface of the Data Steward application, a user can sort information displayed in the subject table editor interface based on subject identifications, subject names, molecule teams, capture dates, whether the subject ID is locked, whether a subject ID has been processed, descriptions, and/or other information associated with a subject ID. In the embodiment shown in FIG. 4, a user can lock or unlock a particular subject ID by selecting or deselecting a checkbox corresponding to a particular subject ID. Likewise, in FIG. 4, a user can select whether or not a subject is processed by selecting or deselecting a checkbox corresponding to a particular subject ID. Numerous other embodiments are disclosed herein and variations are within the scope of this disclosure.

FIG. 5 is a screen shot of patient selection according to one embodiment of the present invention. In the patient selection interface of the Data Steward application, a user can select ICD codes for a subject such that matching patient records can be identified. In the embodiment shown in FIG. 5, a user can select a subject identification and a data source for the subject identification. In this embodiment, a user can add and/or remove ICD9 codes. In other embodiments, other information which identifies a disease, illness, or other information usable to filter patient records may be selected. In the embodiment shown in FIG. 5, information regarding patient counts, such as patient counts in fact stage and/or patient counts in facts, may be displayed based at least in part on the selected information. For example, in FIG. 5, an Electronic Health Records (EHR) data source has been selected as well as various ICD9 codes. In this embodiment, a patient count, condition count, and medication count are displayed based at least in part on the selected data source and the selected ICD9 codes. Numerous other embodiments are disclosed herein and variations are within the scope of this disclosure.

FIG. 6 is a screen shot of a data source table editor according to one embodiment of the present invention. In the embodiment shown in FIG. 6, information corresponding to various data sources is displayed. For example, a data source in the DataSource Tables Editor can have a corresponding data source identification, data source name, description, create date, file date, and/or data source owner. In other embodiments, more or less information for one or more data sources is shown in the DataSource Tables Editor user interface. Numerous other embodiments are disclosed herein and variations are within the scope of this disclosure.

FIGS. 7, 8, and 9 are screen shots of investigators for investigator performance loads according to embodiments of the present invention. These screenshots illustrate performance load criteria that may be selected or removed for a particular subject identification according to embodiments. For example, various indicators, phases, and/or trials may be selected or removed for one or more subject identifications. Numerous other embodiments are disclosed herein and variations are within the scope of this disclosure.

FIG. 10 is a screen shot of a patient prevalence editor according to one embodiment of the present invention. In the embodiment shown in FIG. 10, a user can filter information from various databases based on one or more of the following: subjects, data sources, countries, country population, prevalence, prevalence factor, prevalence per population, and/or supporting evidence. Data displayed in the graphical user interface of the Data Steward application may be sorted by subject, data source, country name, country population, prevalence, prevalence rate, prevalence per population, and/or supporting evidence. In the embodiment shown in FIG. 10, a column header can be dragged to a particular location of the user interface to group information by that column. Numerous other embodiments are disclosed herein and variations are within the scope of this disclosure.
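The grouping and sorting behavior described for the patient prevalence editor might be sketched as follows. This is an illustrative example only; the function names and the dictionary-based row layout are hypothetical and are not drawn from the Data Steward application.

```python
from collections import defaultdict


def group_by_column(rows, column):
    """Group row dicts by the value of one column, as when a header is dragged."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    return dict(groups)


def sort_by_column(rows, column, descending=False):
    """Return a new list of rows sorted by the value of one column."""
    return sorted(rows, key=lambda row: row[column], reverse=descending)
```

Both helpers return new structures and leave the input rows unchanged, so the same underlying data can be regrouped or resorted as the user changes the view.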

Exemplary Investigator Data Model

FIG. 11 is an exemplary investigator data model according to one embodiment of the present invention. In the embodiment shown in FIG. 11, a FactInvestigatorPerformance table contains information associated with the patient randomization, enrollment rates, and failure rates as well as information associated with investigator identifications, subject identifications, data source identifications, and location identifications which correspond with additional information contained in other tables and/or databases. For example, a particular subject identification (SubjectID) in the FactInvestigatorPerformance table may correspond with a SubjectID in the DIM.Subject table.

By using the SubjectID in the FactInvestigatorPerformance table to query the SubjectID in the DIM.Subject table, additional information such as the subject name, molecule team, and/or capture date can be determined for the SubjectID contained in the FactInvestigatorPerformance table. In some embodiments, information contained in various tables and/or databases may be linked in a chain. For example, a DataSourceID in the FactInvestigatorPerformance table may correspond with a DataSourceID in the Dim.DataSource table. The Dim.DataSource table, in turn, may contain a DataSourceOwnerID for a particular DataSourceID which corresponds with a DataSourceOwnerID in the Dim.DataSourceOwner table. Thus, by querying the various tables and/or databases, a DataSourceID in the FactInvestigatorPerformance table can be used to determine information such as the LastModifiedDate in the Dim.DataSourceOwner table for the DataSource associated with the DataSourceID. Numerous other embodiments are disclosed herein and variations are within the scope of this disclosure.
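The chained lookup described above can be sketched, under stated assumptions, with two joins. In this Python example the tables are reduced to the columns needed for the chain, the dotted table names are written without the dot (`DimDataSource`, `DimDataSourceOwner`) because SQLite would otherwise treat the prefix as a schema name, and the sample rows are invented for illustration.

```python
import sqlite3


def build_chain_demo():
    """Create cut-down, hypothetical versions of the three chained tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE DimDataSourceOwner (
            DataSourceOwnerID INTEGER PRIMARY KEY,
            LastModifiedDate TEXT);
        CREATE TABLE DimDataSource (
            DataSourceID INTEGER PRIMARY KEY,
            DataSourceOwnerID INTEGER REFERENCES DimDataSourceOwner(DataSourceOwnerID));
        CREATE TABLE FactInvestigatorPerformance (
            InvestigatorID INTEGER,
            SubjectID INTEGER,
            DataSourceID INTEGER REFERENCES DimDataSource(DataSourceID));
        INSERT INTO DimDataSourceOwner VALUES (10, '2012-06-01');
        INSERT INTO DimDataSource VALUES (5, 10);
        INSERT INTO FactInvestigatorPerformance VALUES (100, 1, 5);
    """)
    return conn


def owner_last_modified(conn, data_source_id):
    """Follow fact -> data source -> owner and return the owner's LastModifiedDate."""
    row = conn.execute(
        "SELECT DISTINCT o.LastModifiedDate"
        " FROM FactInvestigatorPerformance AS f"
        " JOIN DimDataSource AS d ON d.DataSourceID = f.DataSourceID"
        " JOIN DimDataSourceOwner AS o ON o.DataSourceOwnerID = d.DataSourceOwnerID"
        " WHERE f.DataSourceID = ?",
        (data_source_id,),
    ).fetchone()
    # None signals that no fact row references the given data source.
    return row[0] if row else None
```

Each join follows one link of the chain, so additional links could be added as further joins.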

Below is a description of the various tables of an investigator data model according to one embodiment of the present invention:

List of Tables in Investigator Data Model

| Name | Comment |
|---|---|
| DataSourceOwner | |
| DIM.Country | Conformed country dimension consisting of ISO-3166 standards and minor manual updates to streamline country presentation data for Semio. |
| Dim.DataSource | |
| Dim.Location | Location dimension table contains location information about country, state, city, latitude, and longitude. |
| DIM.Subject | Subject table is the dimensional table used to snapshot fact data by molecule team, search phrase, and date. |
| Fact.InvestigatorPerformance | Fact information about Investigator performance. |

Column(s) of “DataSourceOwner” Table

| Name | Datatype | Comment | Is PK | Is FK |
|---|---|---|---|---|
| DataSourceOwnerID | int | | Yes | No |
| DataSourceOwner | nvarchar(50) | | No | No |
| CreationDate | datetime | | No | No |
| LastModifiedDate | datetime | | No | No |
| LastModifiedByID | varbinary(85) | | No | No |
| IsActiveFlag | bit | | No | No |

Column(s) of “DIM.Country” Table

| Name | Datatype | Comment | Is PK | Is FK |
|---|---|---|---|---|
| CountryID | integer | Country Identification Number. The primary key | Yes | No |
| CountryName | varchar( ) | Country Description | No | No |
| AbbreviationSmall | nchar(2) | Two-letter abbreviation of the country name | No | No |
| AbbreviationLarge | nchar(3) | Three-letter abbreviation of the country name | No | No |
| IsActiveFlag | char(18) | Bit indicator for the validity of the record | No | No |
| IsInferredFlag | char(18) | | No | No |
| AuditETLID | char(18) | Reference to Audit.ExecutionLog key for auditing | No | No |
| FIPS | nvarchar(255) | Two-letter code for the country | No | No |
| GMI | nvarchar(255) | Three-letter code for the country | No | No |
| Population | bigint | Population of the country | No | No |
| SQKM | float | Total area of the country in square kilometers | No | No |
| SQMI | float | Total area of the country in square miles | No | No |
| Geometery | geometry | Geometrical information of the country | No | No |
| LandLocked | char(1) | Whether the country is landlocked | No | No |
| CTReferenceCode | varchar(10) | Clinical Trial reference code | No | No |
| Latitude | decimal(19, 12) | Latitude information of the country | No | No |
| Longitude | decimal(19, 12) | Longitude information of the country | No | No |
| ISOName | nvarchar(255) | | No | No |

Column(s) of “Dim.DataSource” Table
Name | Datatype | Comment | Is PK | Is FK
DataSourceID | int | Data Source Unique Identity Number | Yes | No
DataSourceName | nvarchar(50) | Name of the Data Source Provider | No | No
DataSourceOwner | nvarchar(100) | Name of the Data Source Owner | No | No
DataSourceDescription | nvarchar(255) | Description of the Data Source | No | No
CreationDate | datetime | The date on which the data is being inserted | No | No
LastModifiedDate | datetime | Last modified date | No | No
LastModifiedByID | varbinary(85) | Last modified by Identity number | No | No
IsActiveFlag | bit | Bit indicator for the validity of the record | No | No
DataSourceOwnerID | int | | No | Yes

Column(s) of “Dim.Location” Table
Name | Datatype | Comment | Is PK | Is FK
LocationID | int | Location identity number | Yes | No
CountryID | integer | Country identity number. Foreign key referenced from Dim.Country table. | No | Yes
State | nvarchar(30) | State information | No | No
City | nvarchar(30) | City information | No | No
Latitude | float(19, 12) | Latitude information | No | No
Longitude | float(19, 12) | Longitude information | No | No
GeoNameID | int | Geographical detail about a location | No | No

Column(s) of “DIM.Subject” Table
Name | Datatype | Comment | Is PK | Is FK
SubjectID | integer | Subject Identification | Yes | No
SubjectName | varchar(20) | Subject Name | No | No
Molecule_Team | char(18) | Molecule team information | No | No
CaptureDate | char(18) | | No | No

Column(s) of “Fact.InvestigatorPerformance” Table
Name | Datatype | Comment | Is PK | Is FK
InvestigatorID | int | Investigator Identity. The primary key in the table. | Yes | No
SubjectID | integer | Subject Identification. Foreign key from Subject Dimension Table. | Yes | Yes
DataSourceID | int | Data Source Unique Identity Number, appearing in this table as foreign key | Yes | Yes
LocationID | int | Location identity number | No | Yes
TotalTrials | int | How many trials Investigator & Site participated in for this particular indication | No | No
SiteStartupCyleTime | int | Average time for Investigator & Site to open enrollment after the contract is signed | No | No
EnrolledINLast16Months | bit | Number of enrollment for the last 16 months | No | No
PatientRandomizedMedian | decimal(19, 12) | Median randomization of patients across all trials for the indication. | No | No
PatientRandomized | int | Average randomization of patients across all trials for the indication. | No | No
PatientRandomizedMaximum | decimal(19, 12) | Maximum randomization of patients across all trials for the indication. | No | No
EnrollmentRateMonthlyMean | decimal(19, 12) | | No | No
EnrollmentRateMonthly | int | Enrollment rate per month | No | No
EnrollmentRateMonthlyMeadian | decimal(19, 12) | Median Enrollment Rate per month | No | No
EnrollmentRateMonthlyStandardDev | decimal(19, 12) | Standard Deviation Monthly Data. | No | No
PatientScreened | int | The number of patients screened for this indication. | No | No
ScreenFailureRate | decimal(19, 12) | Percentage of patients that were unable to participate due to failed screening. | No | No
DropoutRate | decimal(19, 12) | Percentage of enrolled patients who dropped out from the trial. | No | No
QueryRate | int | Number of queries per 100 pages of CRFs | No | No
InvestigatorEnrollmentFactor | int | Calculated performance ranking of the Investigators. | No | No
IsActiveFlag | bit | | No | No
AuditETLFlag | int | | No | No
SiteStartUpCycleTimeStandardDev | decimal(19, 12) | Standard Deviation for Investigator & Site to open enrollment after the contract is signed | No | No
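As an illustration only, the composite primary key listed for Fact.InvestigatorPerformance (InvestigatorID, SubjectID, DataSourceID) can be sketched with Python's built-in sqlite3 module. The table name without the schema prefix, the reduced column set, the SQLite types, and the helper add_performance are assumptions made for this sketch and are not part of the disclosed data model.

```python
import sqlite3

# Hypothetical, simplified subset of Fact.InvestigatorPerformance.
# SQLite types stand in for the datatypes listed in the table above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE FactInvestigatorPerformance (
        InvestigatorID INTEGER NOT NULL,
        SubjectID      INTEGER NOT NULL,
        DataSourceID   INTEGER NOT NULL,
        TotalTrials    INTEGER,
        DropoutRate    REAL,
        PRIMARY KEY (InvestigatorID, SubjectID, DataSourceID)
    )
    """
)


def add_performance(investigator_id, subject_id, data_source_id,
                    total_trials, dropout_rate):
    """Insert one row; return False if the composite key already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO FactInvestigatorPerformance VALUES (?, ?, ?, ?, ?)",
                (investigator_id, subject_id, data_source_id,
                 total_trials, dropout_rate),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Under this sketch, the same investigator may hold one performance row per subject identification and data source, so each subject identification can carry its own snapshot of investigator performance.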

Exemplary Country Data Model

FIG. 12 is an exemplary country data model according to one embodiment of the present invention. In the embodiment shown in FIG. 12, each country is associated with a unique country identification (CountryID). In this embodiment, the CountryID is associated with other information such as a country name, country abbreviations, population, and/or GPS coordinates. The CountryID for a particular country may be associated with information contained in other tables and/or databases. For example, in the FactPatientPrevalence table shown in FIG. 12, a CountryID and a subject identification (SubjectID) may be used to determine a prevalence rate, prevalence per population, and incidence per population. As another example, a CountryID and a SubjectID may be used to query a FactTrialSaturation table to determine an active trial count. The embodiment shown in FIG. 12 depicts numerous other associations between CountryIDs and information in other tables and/or databases. Numerous other embodiments are disclosed herein and variations are within the scope of this disclosure.
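The lookups described above, in which a CountryID and a SubjectID together select prevalence figures and an active trial count, might be sketched as follows using Python's built-in sqlite3 module. The simplified table definitions, the sample rows, and the function country_metrics are hypothetical and serve only to illustrate the keyed join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE DimCountry (CountryID INTEGER PRIMARY KEY, CountryName TEXT);
    CREATE TABLE FactPatientPrevalence (
        CountryID INTEGER, SubjectID INTEGER,
        PrevalenceRate REAL, PrevalencePerPopulation REAL,
        IncidencePerPopulation REAL,
        PRIMARY KEY (CountryID, SubjectID)
    );
    CREATE TABLE FactTrialSaturation (
        CountryID INTEGER, SubjectID INTEGER, ActiveTrialCount INTEGER,
        PRIMARY KEY (CountryID, SubjectID)
    );
    -- Made-up sample rows for illustration only.
    INSERT INTO DimCountry VALUES (1, 'Exampleland');
    INSERT INTO FactPatientPrevalence VALUES (1, 7, 0.05, 5000.0, 120.0);
    INSERT INTO FactTrialSaturation VALUES (1, 7, 12);
    """
)


def country_metrics(country_id, subject_id):
    """Return (name, prevalence rate, prevalence per population,
    incidence per population, active trial count), or None if no match."""
    return conn.execute(
        """
        SELECT c.CountryName, p.PrevalenceRate, p.PrevalencePerPopulation,
               p.IncidencePerPopulation, t.ActiveTrialCount
        FROM DimCountry AS c
        JOIN FactPatientPrevalence AS p ON p.CountryID = c.CountryID
        LEFT JOIN FactTrialSaturation AS t
               ON t.CountryID = p.CountryID AND t.SubjectID = p.SubjectID
        WHERE c.CountryID = ? AND p.SubjectID = ?
        """,
        (country_id, subject_id),
    ).fetchone()
```

For example, country_metrics(1, 7) returns the single sample row, while a SubjectID with no prevalence data for that country yields None.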

Below is a description of the various tables of a country data model according to one embodiment of the present invention:

List of Tables in Country Data Model
Name | Comment
DataSourceOwner |
DIM.Country | Conformed country dimension consisting of ISO-3166 standards and minor manual updates to streamline country presentation data for Semio.
Dim.CountryCycleTime |
Dim.DataSource |
DIM.Subject | Subject table is the dimensional table used to snapshot fact data by molecule team, search phrase, and date.
Fact.CycleTime | Fact table containing cycle time information for a given Country, Subject, and type (CT Materials, throughput, etc)
Fact.MonthlyRecruitment | Fact table containing cycle time information for a given Country, Subject, and type (CT Materials, throughput, etc)
Fact.PatientPrevalence | Contains the prevalence of a particular disease condition (captured in SubjectID) for a given Country (countryID)
Fact.Staffing | Fact table containing staff information for a given Country, Subject, and type.
Fact.TrialSaturation | By Country, By Subject (molecule team + search phrase) - the number of active trials.

Column(s) of “DataSourceOwner” Table
Name | Is PK | Is FK | Comment
DataSourceOwnerID | Yes | No |
DataSourceOwner | No | No |
CreationDate | No | No |
LastModifiedDate | No | No |
LastModifiedByID | No | No |
IsActiveFlag | No | No |

Column(s) of “DIM.Country” Table
Name | Is PK | Is FK | Comment
CountryID | Yes | No | Country Identification Number. The primary key
CountryName | No | No | Country Description
AbbreviationSmall | No | No | Two-letter abbreviation of the country name
AbbreviationLarge | No | No | Three-letter abbreviation of the country name
IsActiveFlag | No | No | Bit indicator for the validity of the record
IsInferredFlag | No | No |
AuditETLID | No | No | Reference to Audit.ExecutionLog key for auditing
FIPS | No | No | Two-letter code for the country
GMI | No | No | Three-letter code for the country
Population | No | No | Population of the country
SQKM | No | No | The total square kilometers of a country
SQMI | No | No | The total square miles of a country
Geometery | No | No | Geometrical information of the country
Landlocked | No | No | Whether the country is landlocked
CTReferenceCode | No | No | Clinical Trial reference code
Latitude | No | No | Latitude information of the country
Longitude | No | No | Longitude information of the country
ISOName | No | No |

Column(s) of “Dim.CountryCycleTime” Table
Name | Is PK | Is FK | Comment
CountryID | Yes | No |
IterationID | Yes | No |
CycleTimeTypeID | Yes | No |
DataSourceID | Yes | No |
CycleTimeDecimal | No | No |
CycleTimeStandardDev | No | No |
TheorecticalApprovalTimeInDays | No | No |
AverageApprovalTimeInDays | No | No |
SupportingEvidence | No | No |
IsActiveFlag | No | No |
AuditETLID | No | No |
CreatedDate | No | No |
ModifiedDate | No | No |
Modifiedby | No | No |

Column(s) of “Dim.DataSource” Table
Name | Is PK | Is FK | Comment
DataSourceID | Yes | No | Data Source Unique Identity Number
DataSourceName | No | No | Name of the Data Source Provider
DataSourceOwner | No | No | Name of the Data Source Owner
DataSourceDescription | No | No | Description of the Data Source
CreationDate | No | No | The date on which the data is being inserted
LastModifiedDate | No | No | Last modified date
LastModifiedByID | No | No | Last modified by Identity number
IsActiveFlag | No | No | Bit indicator for the validity of the record
DataSourceOwnerID | No | Yes |

Column(s) of “DIM.Subject” Table
Name | Is PK | Is FK | Comment
SubjectID | Yes | No | Subject Identification
SubjectName | No | No | Subject Name
MoleculeTeam | No | No | Molecule team information
CaptureDate | No | No | The date data was inserted in the database
IsLocked | No | No |
IsActiveVariableFlag | No | No |
ProcessFlag | No | No |
SubjectDetailDescription | No | No |

Column(s) of “Fact.CycleTime” Table
Name | Is PK | Is FK | Comment
CountryID | Yes | Yes | Country Identification Number. The primary key
SubjectID | Yes | Yes | Subject Identification
CycleTimeTypeID | Yes | Yes |
DataSourceID | Yes | Yes | Data Source Unique Identity Number
CycleTime | No | No | Cycle time information
SupportiveEvidence | No | No | The XML document containing the supportive information.
IsActiveFlag | No | No | Bit indicator for the validity of the record
AuditETLID | No | No | Reference to Audit.ExecutionLog key for auditing
TheorecticalApprovalTimeInDays | No | No | Approval time in days for the country
AverageApprovalTimeInDays | No | No | Average time in days for the country
CyleTimeStandardDev | No | No | Average randomization of patients across all trials for the indication.
IterationID | No | Yes |

Column(s) of “Fact.MonthlyRecruitment” Table
Name | Is PK | Is FK | Comment
CountryID | Yes | Yes | Country Identification Number. The primary key
SubjectID | Yes | Yes | Subject Identification
MonthID | Yes | No | Month Identification
PatientPerSitePerMonth | No | No | Number of patients per site per month
SupportiveEvidence | No | No | The XML document containing the supportive information.
IsActiveFlag | No | No | Bit indicator for the validity of the record
AuditETLID | No | No | Reference to Audit.ExecutionLog key for auditing
EnrollmentRateMonthlyStandardDev | No | No | Standard Deviation calculation of Enrollment rate per month
LowerEnrollmentRateMonthlyStandardDev | No | No | Lower side Standard Deviation calculation of Enrollment rate per month
UpperEnrollmentRateMonthlyStandardDev | No | No | Upper side Standard Deviation calculation of Enrollment rate per month
PatientRandomized | No | No | Randomized Average Number of patients at country level.
DataSourceID | No | Yes | Data Source Unique Identity Number

Column(s) of “Fact.PatientPrevalence” Table
Name | Is PK | Is FK | Comment
CountryID | Yes | Yes | Country Identification Number. The primary key
SubjectID | Yes | Yes | Subject Identification
PrevalencePer | No | No | The column contains the number of patients for calculation
PrervalenceRate | No | No | The calculated prevalence rate
SupportingEvidence | No | No | The XML document containing the supportive information.
IsActiveFlag | No | No | Bit indicator for the validity of the record
AuditETLID | No | No | Reference to Audit.ExecutionLog key for auditing
DataSourceID | No | Yes | Data Source Unique Identity Number
PrevalencePerPopulation | No | No |
IncidencePer | No | No |
IncidencePerPopulation | No | No |
IncidenceSupportingEvidence | No | No |

Column(s) of “Fact.Staffing” Table
Name | Is PK | Is FK | Comment
CountryID | Yes | Yes | Country Identification Number. The primary key
SubjectID | Yes | Yes | Subject Identification
CRANumber | No | No | Total number of CRA in a country
SupportiveEvidence | No | No | XML document containing evidentiary details of the findings in the fact.
IsActiveFlag | No | No | Bit indicator for the validity of the record
AuditETLID | No | No | Reference to Audit.ExecutionLog key for auditing
StaffTotal | No | No | Total number of staff
DataSourceID | No | Yes | Data Source Unique Identity Number

Column(s) of “Fact.TrialSaturation” Table
Name | Is PK | Is FK | Comment
CountryID | Yes | Yes | Country Identification Number. The primary key
SubjectID | Yes | Yes | Subject Identification
ActiveTrialCount | No | No | Number of active trials
SupportiveEvidence | No | No | The XML document containing the supportive information.
IsActiveFlag | No | No | Bit indicator for the validity of the record
AuditETLID | No | No | Reference to Audit.ExecutionLog key for auditing
DataSourceID | No | Yes | Data Source Unique Identity Number

Exemplary Patient Data Model

FIG. 13 is an exemplary patient data model according to one embodiment of the present invention. In the embodiment shown in FIG. 13, a FactPatient table stores information associated with various patients. For example, patients in the FactPatient table may be assigned a unique patient identification number (PatientID). In this embodiment, the PatientID can be associated with other information such as the patient's age, gender, year of birth, and/or other information. The PatientID may also be associated with information contained in the other tables and/or databases. For example, a PatientID may be associated with a location identification (LocationID). In this embodiment, the LocationID for the patient corresponds with a location identification of a separate table (DimLocation). The LocationID for the patient can be used to determine information such as a city, a state, and/or GPS coordinates associated with the PatientID. As another example, a SubjectID associated with a particular PatientID may be used to determine a subject's name, molecule team, capture date, and/or other information contained in another table and/or database having a corresponding SubjectID. The embodiment shown in FIG. 13 depicts numerous other associations between information corresponding to PatientIDs and information in other tables and/or databases. Furthermore, numerous other embodiments are disclosed herein and variations are within the scope of this disclosure. Numerous other data models and variations of the data models described herein are likewise within the scope of this disclosure.
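The way a PatientID leads to location and subject details through its LocationID and SubjectID can be sketched in Python as below. The record classes, the reduced field sets, the in-memory dictionaries, the sample values, and the function resolve_patient are all hypothetical stand-ins for rows of the patient, location, and subject tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:  # stands in for a row of the location dimension table
    location_id: int
    city: str
    state: str
    latitude: float
    longitude: float


@dataclass(frozen=True)
class Subject:  # stands in for a row of the subject dimension table
    subject_id: int
    subject_name: str
    molecule_team: str
    capture_date: str


@dataclass(frozen=True)
class Patient:  # stands in for a row of the patient fact table
    patient_id: int
    subject_id: int
    location_id: int
    gender: str
    birth_year: int


# Made-up sample rows, keyed by their identifiers.
locations = {1: Location(1, "Springfield", "IL", 39.78, -89.65)}
subjects = {7: Subject(7, "Example Subject", "Team A", "2012-06-22")}
patients = {100: Patient(100, 7, 1, "F", 1970)}


def resolve_patient(patient_id):
    """Follow the patient's foreign keys to its location and subject rows."""
    patient = patients[patient_id]
    return patient, locations[patient.location_id], subjects[patient.subject_id]
```

Calling resolve_patient(100) returns the patient row together with the city, state, and coordinates of its location and the name, molecule team, and capture date of its subject.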

Below is a description of the various tables of a patient data model according to one embodiment of the present invention:

List of Tables for Patient Data Model
Name | Comment
DataSourceOwner |
Dim.ConcomitantMedication | ConcomitantMedication dimension table contains name of the medication, class information.
Dim.DataSource |
Dim.Ethnicity | Ethnicity dimension table contains ethnicity information about patient
Dim.Location | Location dimension table contains location information about country, state, city, latitude, and longitude.
DIM.Subject | Subject table is the dimensional table used to snapshot fact data by molecule team, search phrase, and date.
Fact.Patient | Fact table containing information for a patient, Subject
Fact.PatientCoMorbidityCondition |
Fact.PatientConcomitantMedication |
ICD9.Codes |
Medication |
TreatmentType |

Column(s) of “DataSourceOwner” Table
Name | Datatype | Comment | Is PK | Is FK
DataSourceOwnerID | int | | Yes | No
DataSourceOwner | nvarchar(50) | | No | No
CreationDate | datetime | | No | No
LastModifiedDate | datetime | | No | No
LastModifiedByID | varbinary(85) | | No | No
IsActiveFlag | bit | | No | No

Column(s) of “Dim.ConcomitantMedication” Table
Name | Datatype | Comment | Is PK | Is FK
ConcomitantMedicationID | int | Concomitant Medication Identity. The Primary Key of the table | Yes | No
MedicationName | nvarchar(50) | Name of the medication | No | No
MedicationClass | nvarchar(50) | Medical Class information. | No | No
DDI | int | Drug Index | No | No
NDC | bigint | Drug Index | No | No
GPI | bigint | Drug Index | No | No
DataSourceID | int | Data Source Identity | No | No

Column(s) of “Dim.DataSource” Table
Name | Datatype | Comment | Is PK | Is FK
DataSourceID | int | Data Source Unique Identity Number | Yes | No
DataSourceName | nvarchar(50) | Name of the Data Source Provider | No | No
DataSourceDescription | nvarchar(255) | Description of the Data Source | No | No
FileDate | datetime | | No | No
CreationDate | datetime | The date on which the data is being inserted | No | No
LastModifiedDate | datetime | Last modified date | No | No
LastModifiedByID | varbinary(85) | Last modified by Identity number | No | No
IsActiveFlag | bit | Bit indicator for the validity of the record | No | No
DataSourceOwnerID | int | Data Source Owner description | No | Yes

Column(s) of “Dim.Ethnicity” Table
Name | Datatype | Comment | Is PK | Is FK
EthnicityID | int | Ethnicity Identity. Primary key of the table | Yes | No
Ethnicity | nvarchar(50) | Details about ethnicity | No | No
NISNumber | int | | No | No
SelfReferenceID | int | | No | No

Column(s) of “Dim.Location” Table
Name | Datatype | Comment | Is PK | Is FK
LocationID | int | Location identity number | Yes | No
CountryID | int | Country identity number. Foreign key referenced from Dim.Country table. | No | No
State | nvarchar(30) | State information | No | No
City | nvarchar(30) | City information | No | No
AsciiName | nvarchar(200) | | No | No
Latitude | float(19, 12) | Latitude information | No | No
Longitude | float(19, 12) | Longitude information | No | No
GeoNameID | int | Geographical detail about a location | No | No
SelfReference | int | Self referenced number | No | No

Column(s) of “DIM.Subject” Table
Name | Datatype | Comment | Is PK | Is FK
SubjectID | integer | Subject Identification | Yes | No
SubjectName | varchar(20) | Subject Name | No | No
MoleculeTeam | char(18) | Molecule team information | No | No
CaptureDate | char(18) | The date on which the record was captured. | No | No
IsLocked | bit | | No | No
IsActiveVariableFlag | bit | | No | No
ProcessFlag | bit | | No | No
SubjectDetailDescription | nvarchar(max) | | No | No

Column(s) of “Fact.Patient” Table
Name | Datatype | Comment | Is PK | Is FK
PatientID | int | Patient Identification Number | Yes | No
SubjectID | integer | Subject Identification | No | Yes
EthnicityID | int | Ethnicity Identity. Primary key of the table | No | Yes
LocationID | int | Location identity number | No | Yes
PatientSourceID | int | This field indicates the source for the patient information. | No | No
DataSourceID | int | Data Source Identity number indicating the source of the data | No | Yes
Gender | varchar( ) | Patient Gender Information | No | No
Age | numeric(,) | Patient Age | No | No
BirthYear | int | Date of Birth Year | No | No
SourcePatientID | int | Source Patient Identity | No | No
Ethnicity | nvarchar(50) | Information about Ethnicity | No | No
AuditETLID | int | Reference to Audit.ExecutionLog key for auditing | No | No
IsActiveFlag | bit | Bit indicator for the validity of the record | No | No
BMI | decimal(19, 12) | Body Mass Index | No | No
CreatinineValue | decimal(19, 12) | | No | No
eGERValue | decimal(19, 12) | | No | No
ProteinCreatinineRatio | decimal(19, 12) | | No | No

Column(s) of “Fact.PatientCoMorbidityCondition” Table
Name | Datatype | Comment | Is PK | Is FK
PatientCoMorbidityCondtionID | int | Patient CoMorbidity Condition Unique Identity Number. The Primary Key of the table | Yes | No
PatientID | int | Patient Identification Number. Foreign Key from Fact.PatientType | No | Yes
CodeID | int | | No | No
SubjectID | integer | Subject Identification Number. Foreign Key from Dim.Subject. | No | Yes
MedcinID | int | Medication Identity Number | No | No
Type | nvarchar(55) | | No | No
Category | nvarchar(55) | | No | No
Status | nvarchar(55) | | No | No
OnSetDate | datetime | | No | No
CreationDate | datetime | | No | No
LastModifiedDate | datetime | | No | No
LastModifiedByID | varbinary(85) | | No | No
IsActiveFlag | bit | | No | No
DataSourceID | int | Data Source Unique Identity Number | No | Yes
ICD9CodeID | int | Code Identity | No | Yes
DiagnosisTypeID | int | | No | No

Column(s) of “Fact.PatientConcomitantMedication” Table
Name | Datatype | Comment | Is PK | Is FK
PatientConcomitantMedicationID | int | PatientConcomitantMedicationID is the unique primary key to the table | Yes | No
PatientID | int | Patient Identification Number | No | Yes
ConcomitantMedicationID | int | Concomitant Medication Identity. The Primary Key of the table | No | Yes
SubjectID | integer | Subject Identification | No | Yes
Dose | nvarchar(55) | Dose of the Medication | No | No
Strength | nvarchar(55) | Strength of the Medication | No | No
Form | nvarchar(55) | Form of the Medication | No | No
Units | nvarchar(55) | Unit of the Medication | No | No
Quantity | nvarchar(55) | Quantity of the Medication | No | No
Status | nvarchar(55) | Status of the Medication | No | No
CreationDate | datetime | Bit indicator for the validity of the record | No | No
LastModifiedDate | datetime | | No | No
LastModifiedByID | varbinary(85) | | No | No
IsActiveFlag | bit | | No | No
DataSourceID | int | Data Source Unique Identity Number | No | Yes
MedicationID | int | | No | Yes
TreatmentTypeID | int | | No | Yes

Column(s) of “ICD9.Codes” Table
Name | Datatype | Comment | Is PK | Is FK
ICD9CodeID | int | | Yes | No
IndicationID | int | | No | No
IndicationGroupID | int | | No | No
Code | varchar(20) | | No | No
Description | varchar(255) | | No | No
IndicationGroupCode | varchar(20) | | No | No
ParentCodeID | int | | No | No
DataSourceID | int | Data Source Identity number indicating the source of the data | No | No

Column(s) of “Medication” Table
Name | Datatype | Comment | Is PK | Is FK
MedicationID | int | | Yes | No
MedicationName | nvarchar(75) | | No | No
DataSourceID | int | | No | No

Column(s) of “TreatmentType” Table
Name | Datatype | Comment | Is PK | Is FK
TreatmentTypeID | int | | Yes | No
TreatmentTypeDescription | nvarchar(50) | | No | No
CreationDate | date | | No | No

Advantages

Embodiments of the present invention provide many advantages over conventional methods of predicting the enrollment for clinical trials. For example, embodiments of the present invention allow subject identifications (IDs) to be created through one or more user interfaces. In one embodiment, a user can create one or more subject IDs without technical expertise. For example, using one or more user interfaces, a user can create a new subject by entering or selecting a subject name, molecule team, and/or a capture date. A unique subject identification may be dynamically created for the subject. In another embodiment, a user can update an existing subject. For example, a user may be able to add or update a capture date or other information associated with a particular subject ID. Based at least in part on the capture date, data from various tables and/or databases associated with a subject ID may be limited. For example, the capture date may provide a static point in time for which information contained in the tables and/or databases is available. Thus, if the information for a particular table and/or database is dated on or before the capture date, then the information is available to the subject identification. Alternatively, if the information for a particular table and/or database is dated after the capture date, then this information may not be available to the subject identification.
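A minimal sketch of the capture-date behavior, assuming each record carries a date on which it was recorded, is shown below in Python. The record layout, the field name recorded_on, and the function visible_records are hypothetical and are used only to illustrate the on-or-before rule.

```python
from datetime import date


def visible_records(records, capture_date):
    """Return only the records dated on or before the capture date."""
    return [r for r in records if r["recorded_on"] <= capture_date]


# Made-up records; "recorded_on" is an assumed field name.
records = [
    {"id": 1, "recorded_on": date(2012, 1, 15)},
    {"id": 2, "recorded_on": date(2012, 6, 22)},
    {"id": 3, "recorded_on": date(2012, 9, 1)},
]

# A subject identification with this capture date sees records 1 and 2 only.
snapshot = visible_records(records, date(2012, 6, 22))
```

Because the filter is applied when the data is read, the underlying tables can continue to receive new rows after the capture date without changing what the subject identification sees.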

Embodiments of the present invention provide one or more core domains of information that may be used for analysis of a clinical trial plan. For example, patient domains, country domains, and/or investigator domains of information can be used according to one embodiment. In some embodiments, one or more core domains are built generically such that the system can manage data sources that are unknown at the time the core domain is created. In this way, using a generic structure that supports a domain, additional data sources can be added and/or updated as additional information and/or data sources become available.
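One way to picture such a generic structure is a registry in which each data source is a row identified by a DataSourceID, so that a new source is added as data rather than as a schema change. The Python sketch below is an assumption-laden illustration: the class DataSourceRegistry, its methods, and the dictionary layout are hypothetical and do not describe the disclosed implementation.

```python
class DataSourceRegistry:
    """Hypothetical in-memory stand-in for a generic data source structure."""

    def __init__(self):
        self.sources = {}  # DataSourceID -> descriptive attributes
        self.facts = []    # fact rows tagged with their DataSourceID

    def register(self, name, owner):
        """Add a data source that was unknown when the domain was built."""
        data_source_id = len(self.sources) + 1
        self.sources[data_source_id] = {
            "name": name, "owner": owner, "is_active": True,
        }
        return data_source_id

    def add_fact(self, data_source_id, subject_id, values):
        """Store a fact row; the source must already be registered."""
        if data_source_id not in self.sources:
            raise KeyError(f"unknown DataSourceID {data_source_id}")
        self.facts.append(
            {"DataSourceID": data_source_id, "SubjectID": subject_id, **values}
        )

    def facts_for(self, subject_id):
        """Return all fact rows associated with a subject identification."""
        return [f for f in self.facts if f["SubjectID"] == subject_id]
```

In this sketch, registering a second or third source requires no change to how facts are stored or queried; each fact simply records which source it came from.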

Subject identifications may be associated with at least a portion of the information for one or more domains. For example, subject identifications may be associated with information contained in a patient domain, a country domain, an investigator domain, and/or other domains or data sources.

Once various parameters are chosen and associations between subject identifications and information in the domains have been created, embodiments of the present invention are able to take a mathematical approach to analyzing and presenting data regarding the actual investigators and investigation sites. The embodiments can then create graphical representations, e.g., line graphs that display information, such as predictions for likely scenarios based on average performance as well as best and worst-case scenarios based on outlier data.
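The disclosure does not specify the mathematics behind these predictions. Purely as an illustration, the Python sketch below assumes that the likely scenario uses the mean monthly enrollment rate per site and that the best and worst cases use the highest and lowest observed rates. The function enrollment_scenarios, its parameters, and the sample rates are hypothetical.

```python
import math
from statistics import mean


def enrollment_scenarios(rates_per_site_per_month, n_sites, target_patients):
    """Estimate months needed to enroll target_patients under three scenarios.

    Assumption for this sketch: likely = mean rate, best = highest rate,
    worst = lowest rate, each applied uniformly to every site.
    """
    if not rates_per_site_per_month or min(rates_per_site_per_month) <= 0:
        raise ValueError("rates must be non-empty and positive")

    def months(rate):
        return math.ceil(target_patients / (rate * n_sites))

    return {
        "likely": months(mean(rates_per_site_per_month)),
        "best": months(max(rates_per_site_per_month)),
        "worst": months(min(rates_per_site_per_month)),
    }


# Made-up monthly enrollment rates (patients per site per month).
scenarios = enrollment_scenarios(
    [0.5, 1.0, 1.5, 2.0, 4.0], n_sites=10, target_patients=200
)
```

The three resulting values could then be plotted as separate lines on a graph, one per scenario.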

General

Numerous specific details are set forth herein to provide a thorough understanding of the claimed subject matter. However, those skilled in the art will understand that the claimed subject matter may be practiced without these specific details. In other instances, methods, apparatuses or systems that would be known by one of ordinary skill have not been described in detail so as not to obscure claimed subject matter.

Some portions are presented in terms of algorithms or symbolic representations of operations on data bits or binary digital signals stored within a computing system memory, such as a computer memory. These algorithmic descriptions or representations are examples of techniques used by those of ordinary skill in the data processing arts to convey the substance of their work to others skilled in the art. An algorithm is a self-consistent sequence of operations or similar processing leading to a desired result. In this context, operations or processing involves physical manipulation of physical quantities. Typically, although not necessarily, such quantities may take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared or otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to such signals as bits, data, values, elements, symbols, characters, terms, numbers, numerals or the like. It should be understood, however, that all of these and similar terms are to be associated with appropriate physical quantities and are merely convenient labels. Unless specifically stated otherwise, it is appreciated that throughout this specification discussions utilizing terms such as “processing,” “computing,” “calculating,” “determining,” and “identifying” or the like refer to actions or processes of a computing device, such as one or more computers or a similar electronic computing device or devices, that manipulate or transform data represented as physical electronic or magnetic quantities within memories, registers, or other information storage devices, transmission devices, or display devices of the computing platform.

The system or systems discussed herein are not limited to any particular hardware architecture or configuration. A computing device can include any suitable arrangement of components that provide a result conditioned on one or more inputs. Suitable computing devices include multipurpose microprocessor-based computer systems accessing stored software that programs or configures the computing system from a general purpose computing apparatus to a specialized computing apparatus implementing one or more embodiments of the present subject matter. Any suitable programming, scripting, or other type of language or combinations of languages may be used to implement the teachings contained herein in software to be used in programming or configuring a computing device.

Embodiments of the methods disclosed herein may be performed in the operation of such computing devices. The order of the blocks presented in the examples above can be varied—for example, blocks can be re-ordered, combined, and/or broken into sub-blocks. Certain blocks or processes can be performed in parallel.

The use of “adapted to” or “configured to” herein is meant as open and inclusive language that does not foreclose devices adapted to or configured to perform additional tasks or steps. Additionally, the use of “based on” is meant to be open and inclusive, in that a process, step, calculation, or other action “based on” one or more recited conditions or values may, in practice, be based on additional conditions or values beyond those recited. Headings, lists, and numbering included herein are for ease of explanation only and are not meant to be limiting.

While the present subject matter has been described in detail with respect to specific embodiments thereof, it will be appreciated that those skilled in the art, upon attaining an understanding of the foregoing may readily produce alterations to, variations of, and equivalents to such embodiments. Accordingly, it should be understood that the present disclosure has been presented for purposes of example rather than limitation, and does not preclude inclusion of such modifications, variations and/or additions to the present subject matter as would be readily apparent to one of ordinary skill in the art. 

That which is claimed is:
 1. A method comprising: receiving selection of a subject identification, the subject identification corresponding with at least a subject name; receiving selection of a patient parameter, the patient parameter corresponding with at least one of a physical characteristic or an illness; receiving selection of an investigator parameter, the investigator parameter corresponding with at least one of a trial, a phase, or a medical indicator; receiving selection of a geographic parameter, the geographic parameter corresponding with at least one of a geographic location or a geographic statistic; dynamically creating associations between the subject identification and the patient parameter, the investigator parameter, and the geographic parameter; and determining at least one potential site for a clinical trial based at least in part on the dynamically created associations.
 2. The method of claim 1, further comprising: conducting the clinical trial based at least in part on the determined at least one potential site.
 3. The method of claim 1, where receiving selection of the subject identification comprises: receiving selection of the subject name from a plurality of predefined subject names.
 4. The method of claim 1, where receiving selection of the subject identification comprises: receiving a first input, the first input indicating the subject name; receiving a second input, the second input indicating a molecule team; and dynamically creating the subject identification.
 5. The method of claim 1, wherein receiving selection of the patient parameter comprises receiving selection of at least one medical code, wherein the at least one medical code comprises at least one ICD9 code or ICD10 code.
 6. The method of claim 1, further comprising: in response to receiving the selection of the patient parameter, displaying a number of patients having the selected patient parameter.
 7. The method of claim 6, wherein the number of patients having the selected patient parameter is selected from a plurality of patient databases, each patient database in the plurality of patient databases corresponding to a same generic patient data model.
 8. The method of claim 1, wherein dynamically creating associations between the subject identification and the patient parameter, the investigator parameter, and the country parameter comprises: creating at least one association between the subject identification and at least one patient database comprising the patient parameter; creating at least one association between the subject identification and at least one investigator database comprising the investigator parameter; and creating at least one association between the subject identification and at least one country database comprising the country parameter.
 9. The method of claim 8, wherein each patient database corresponds to a same patient data model, wherein each investigator database corresponds to a same investigator data model, and wherein each country database corresponds to a same country data model.
 10. The method of claim 8, further comprising: receiving selection of a capture date; and updating the subject identification such that only information contained in the at least one patient database, the at least one investigator database, and the at least one country database on or before the capture date is available for the selected subject identification.
 11. The method of claim 1, further comprising: creating a graphical representation based at least in part on the dynamically created associations, the graphical representation providing a prediction for one or more scenarios.
 12. The method of claim 11, wherein the prediction corresponds to at least one of a likely scenario based at least in part on average performance, a best-case scenario, or a worst-case scenario.
 13. A computer-readable medium comprising program code for: receiving selection of a subject identification, the subject identification corresponding with at least a subject name; receiving selection of a patient parameter, the patient parameter corresponding with at least one of a physical characteristic or an illness; receiving selection of an investigator parameter, the investigator parameter corresponding with at least one of a trial, a phase, or a medical indicator; receiving selection of a geographic parameter, the geographic parameter corresponding with at least one of a geographic location or a geographic statistic; dynamically creating associations between the subject identification and the patient parameter, the investigator parameter, and the geographic parameter; and sending information corresponding to at least one of the dynamically created associations to a clinical trial analysis application for conducting a clinical trial.
 14. The computer-readable medium of claim 13, further comprising program code for: receiving a capture date, wherein only information contained in at least one patient database, at least one investigator database, and at least one country database as of the capture date is available for the subject identification.
 15. The computer-readable medium of claim 13, further comprising program code for: creating a graphical representation for the clinical trial based at least in part on the dynamically created associations.
 16. The computer-readable medium of claim 13, further comprising program code for: associating a capture date with the subject identification such that data on or before the capture date can be maintained in a database while allowing the database to continue to receive additional data after the capture date.
 17. A system, comprising: a plurality of databases; an electronic device comprising: an input device; a display; and a processor in communication with the input device, the display, and the plurality of databases, the processor configured for: receiving selection of a subject identification, the subject identification corresponding with at least a subject name; receiving selection of a patient parameter, the patient parameter corresponding with at least one of a physical characteristic or an illness; receiving selection of an investigator parameter, the investigator parameter corresponding with at least one of a trial, a phase, or a medical indicator; receiving selection of a geographic parameter, the geographic parameter corresponding with at least one of a geographic location or a geographic statistic; and dynamically creating associations between the subject identification and the plurality of databases for use in a clinical trial analysis application, the associations comprising a first association between the subject identification and the patient parameter, a second association between the subject identification and the investigator parameter, and a third association between the subject identification and the geographic parameter.
 18. The system of claim 17, further comprising: a network; and a server comprising: a memory, wherein the memory comprises program code for the clinical trial analysis application; a second processor, the second processor in communication with the memory, the second processor configured for: executing the program code for the clinical trial analysis application; receiving information corresponding to at least one of the dynamically created associations from the electronic device through the network; and conducting a clinical trial using at least the clinical trial analysis application and the received information.
 19. The system of claim 17, wherein the plurality of databases comprises a patient database, an investigator database, and a country database, and wherein the processor is further configured for: querying the patient database with the patient parameter; querying the investigator database with the investigator parameter; and querying the country database with the geographic parameter.
 20. The system of claim 17, wherein the processor is further configured for: receiving a capture date; and filtering information in the plurality of databases such that only information in the plurality of databases on or before the capture date is available to the subject identification and such that new information can be added to the plurality of databases after the capture date and the new information is not available to the subject identification. 
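
By way of non-limiting illustration, the capture-date filtering recited in claims 10, 14, 16, and 20 may be sketched as follows. The sketch is one possible reading and not a definitive implementation: the names Record, SubjectIdentification, and visible_records, the in-memory dictionaries standing in for the patient, investigator, and country databases, and all sample values are assumptions introduced here for illustration and do not appear in the disclosure.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    """Hypothetical row in one of the databases."""
    payload: str
    recorded_on: date  # date the information entered the database


@dataclass
class SubjectIdentification:
    """Hypothetical subject identification with an associated capture date."""
    name: str
    capture_date: date


def visible_records(subject, databases):
    """Return, per database, only records dated on or before the capture date.

    The underlying databases are not modified, so they can keep
    receiving new records after the capture date.
    """
    return {
        db_name: [r for r in records if r.recorded_on <= subject.capture_date]
        for db_name, records in databases.items()
    }


# Illustrative stand-ins for the patient, investigator, and country databases.
databases = {
    "patient": [Record("cohort A", date(2012, 3, 1))],
    "investigator": [Record("site 12", date(2012, 5, 15))],
    "country": [Record("population statistic", date(2012, 6, 22))],
}
subject = SubjectIdentification("Study X", date(2012, 6, 22))

# New information arrives after the capture date and is stored as usual.
databases["patient"].append(Record("cohort B", date(2012, 9, 1)))

# The subject identification only sees information as of its capture date.
snapshot = visible_records(subject, databases)
```

In this sketch the filter is applied at read time rather than by copying data, which is one way to keep information as of the capture date stable for the subject identification while the databases continue to grow; other arrangements, such as a stored snapshot, would be equally consistent with the claim language.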